Introduction
This chapter introduces the dissertation by describing its context, the identified challenges, how the chosen challenge was met, the achieved impact, relevant publications, and how the thesis is structured.
1.1 Context
In science, data is the essential focal point of today's computational and quantitative approaches to gaining scientific knowledge. Computational simulations enable far-reaching explorations of modeled realities, while quantitative methods gather data to improve the understanding of observed phenomena. These methods are increasingly viable only with high-end storage and large-scale High Performance Computing resources, and their individual requirements are rising dramatically. Data throughput continuously reaches gigabytes per second, volumes are of petabyte magnitude, sustained rates of files per second are in the double-digit range, and a vast universe of complex data representations exists. The great potential of such data is evident from the current trend of Big Data in science, which aims at large-scale information extraction to foster scientific discoveries. This is fundamentally enabled by handling data intelligently and by combining a large variety of information technology methods into so-called data life cycles. In principle, these consist of data sources, systems to manage data as well as compute resources, methods for access rights management, utilization interfaces, and data sinks. Scientists are naturally focused on their particular research rather than on the underlying IT infrastructure. Metadata is therefore an essential step forward in efficiency of use, as it enables managing data based on its content instead of its location. Via specific data life cycles, scientists are freed from the necessity of dealing extensively with IT infrastructures while still using them to drive their research and to handle their extensive data and computing demands.
In this complex technological environment, a plethora of significant challenges presents itself that hinders the advancement of the state of the art in data-driven knowledge gain.
1.2 Challenges
The vital challenges in managing data life cycles are manifold. Federated authentication and authorization infrastructures need to be integrated while remaining mindful of the overall resilience of increasingly complex data life cycles. The increasing numbers of files and amounts of data need to be managed by Big Data systems. These in turn need to be integrated efficiently with High Performance Computing resources for analysis, which signifies the need for advanced interoperability. Besides automated pre- and postprocessing, the user-friendly creation and execution of workflows that encapsulate complex analysis procedures need to be supported. Integrated scientific environments need to be provided that hide the underlying complexity while still enabling its use. Also essential is building trust that an infrastructure delivers what it promises. Closely connected to this is the move from a fixed-term build-up phase to a sustainable operation phase. As these goals partly conflict with one another, an effective balance between them needs to be developed for each data life cycle.
The dissertation focuses on the major challenge of organizing large numbers of files, in the range of millions, using information about data, so-called metadata. Currently, solutions are often either use-case-specific or missing entirely, which prevents easy access and re-use. Without metadata, users have to remember where each individual file is located. With a large number of files, this is inefficient if not impossible. This holds especially true for Big Data use cases with a large number of files that have complex content and are stored in distributed locations.
At present, significant effort is required to implement even narrowly applicable, pragmatic metadata handling solutions for each new scientific experiment.
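The contrast between locating a file by remembering its path and finding it through metadata can be made concrete with a minimal sketch. It is only an illustration under stated assumptions, not the system developed in this dissertation: the names FileRecord and find_by_metadata, the attribute keys, and all paths are hypothetical, and a real catalog would use a database rather than an in-memory list.

```python
from dataclasses import dataclass, field


@dataclass
class FileRecord:
    """A file location plus descriptive metadata (hypothetical schema)."""
    path: str
    metadata: dict = field(default_factory=dict)


def find_by_metadata(catalog, **criteria):
    """Return the paths of all records whose metadata matches every criterion."""
    return [
        record.path
        for record in catalog
        if all(record.metadata.get(key) == value for key, value in criteria.items())
    ]


# Illustrative catalog; paths and attributes are invented for this example.
catalog = [
    FileRecord("/site-a/run01/img_0001.dat", {"experiment": "exp-A", "channel": "gfp"}),
    FileRecord("/site-b/archive/x17.dat", {"experiment": "exp-A", "channel": "dapi"}),
    FileRecord("/site-a/run02/img_0002.dat", {"experiment": "exp-B", "channel": "gfp"}),
]

# The user states what the data is about, not where it is stored.
matches = find_by_metadata(catalog, experiment="exp-A", channel="gfp")
```

The query names properties of the content, so the user does not need to know that the matching file is stored at a particular site or under a particular directory.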